Importing Required Libraries

In [181]:
import numpy as np
import pandas as pd
import seaborn as sns
import plotly.express as px
import matplotlib.pyplot as plt

Importing Datasets

In [182]:
matches = pd.read_csv("Data/matches.csv")
home_away = pd.read_csv("Data/teamwise_home_and_away.csv")
most_runs = pd.read_csv("Data/most_runs_average_strikerate.csv")
deliveries = pd.read_csv("Data/deliveries.csv")
teams = pd.read_csv("Data/teams.csv")
players = pd.read_csv("Data/Players.csv")

EDA (Exploratory Data Analysis)

In [183]:
teams
Out[183]:
team1
0 Pune Warriors
1 Kolkata Knight Riders
2 Rajasthan Royals
3 Kochi Tuskers Kerala
4 Gujarat Lions
5 Chennai Super Kings
6 Rising Pune Supergiants
7 Delhi Daredevils
8 Deccan Chargers
9 Delhi Capitals
10 Mumbai Indians
11 Sunrisers Hyderabad
12 Rising Pune Supergiant
13 Royal Challengers Bangalore
14 Kings XI Punjab

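Several entries here refer to the same team under a changed name, or to successive franchises from the same city. They are merged in this analysis:

  • Rising Pune Supergiants / Rising Pune Supergiant / Pune Warriors
  • Deccan Chargers / Sunrisers Hyderabad
  • Delhi Daredevils / Delhi Capitals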

Merge these names and replace every team name with its abbreviation

In [184]:
teams=teams.replace({'Sunrisers Hyderabad':"SRH", 'Mumbai Indians': "MI", 'Gujarat Lions':"GL",
       'Rising Pune Supergiants': "Pune", 'Royal Challengers Bangalore': "RCB",
       'Kolkata Knight Riders': "KKR", 'Delhi Daredevils': "Delhi", 'Kings XI Punjab':"Punjab",
       'Chennai Super Kings':"CSK", 'Rajasthan Royals': "RR", 'Deccan Chargers':"SRH",
       'Kochi Tuskers Kerala':"KTK", 'Pune Warriors':"Pune", 'Delhi Capitals':"Delhi", 
        'Rising Pune Supergiant':"Pune"})

teams
Out[184]:
team1
0 Pune
1 KKR
2 RR
3 KTK
4 GL
5 CSK
6 Pune
7 Delhi
8 SRH
9 Delhi
10 MI
11 SRH
12 Pune
13 RCB
14 Punjab
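
The same replacement dictionary is repeated for `teams`, `home_away` and `matches`, and `teams` still lists the merged names more than once after the replacement. A minimal sketch of one way to define the mapping once and drop the repeats. `TEAM_ABBREVIATIONS`, `abbreviate` and `teams_demo` are hypothetical names, and the small frame only stands in for a few rows of `teams.csv`.

```python
import pandas as pd

# Hypothetical constant: a subset of the mapping used in the notebook.
TEAM_ABBREVIATIONS = {
    "Pune Warriors": "Pune",
    "Rising Pune Supergiants": "Pune",
    "Rising Pune Supergiant": "Pune",
    "Delhi Daredevils": "Delhi",
    "Delhi Capitals": "Delhi",
    "Mumbai Indians": "MI",
}

def abbreviate(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with full team names replaced by abbreviations."""
    return frame.replace(TEAM_ABBREVIATIONS)

# Stand-in for a few rows of teams.csv
teams_demo = pd.DataFrame({"team1": [
    "Pune Warriors", "Delhi Daredevils", "Delhi Capitals",
    "Mumbai Indians", "Rising Pune Supergiant",
]})

unique_teams = (
    abbreviate(teams_demo)
    .drop_duplicates()          # keep the first occurrence of each abbreviation
    .reset_index(drop=True)
)
print(unique_teams)
```

The same `abbreviate` helper could then be applied to each frame in turn instead of repeating the dictionary.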
In [185]:
home_away=home_away.replace({'Sunrisers Hyderabad':"SRH", 'Mumbai Indians': "MI", 'Gujarat Lions':"GL",
       'Rising Pune Supergiants': "Pune", 'Royal Challengers Bangalore': "RCB",
       'Kolkata Knight Riders': "KKR", 'Delhi Daredevils': "Delhi", 'Kings XI Punjab':"Punjab",
       'Chennai Super Kings':"CSK", 'Rajasthan Royals': "RR", 'Deccan Chargers':"SRH",
       'Kochi Tuskers Kerala':"KTK", 'Pune Warriors':"Pune", 'Delhi Capitals':"Delhi", 
        'Rising Pune Supergiant':"Pune"})
home_away
Out[185]:
team home_wins away_wins home_matches away_matches home_win_percentage away_win_percentage
0 Pune 5 5 8 8 62.500000 62.500000
1 MI 58 51 101 86 57.425743 59.302326
2 CSK 51 49 89 75 57.303371 65.333333
3 Delhi 3 7 6 10 50.000000 70.000000
4 SRH 30 28 63 45 47.619048 62.222222
5 RR 29 46 67 80 43.283582 57.500000
6 SRH 18 11 43 32 41.860465 34.375000
7 Punjab 38 44 91 85 41.758242 51.764706
8 RCB 35 49 85 95 41.176471 51.578947
9 KKR 34 58 83 95 40.963855 61.052632
10 Delhi 25 42 72 89 34.722222 47.191011
11 Pune 6 6 20 26 30.000000 23.076923
12 KTK 2 4 7 7 28.571429 57.142857
13 GL 1 12 14 16 7.142857 75.000000
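
After the replacement, `home_away` holds two rows each for Pune, SRH and Delhi, so per-team charts built from it mix those rows. One possible way to combine them is to sum the raw counts per team and recompute the percentages from the totals. The sketch below uses only the six affected rows copied from the table above; `home_away_demo` and `combined` are hypothetical names.

```python
import pandas as pd

# Rows copied from the Out[185] table for the three teams that appear twice
home_away_demo = pd.DataFrame({
    "team": ["Pune", "Delhi", "SRH", "SRH", "Delhi", "Pune"],
    "home_wins": [5, 3, 30, 18, 25, 6],
    "away_wins": [5, 7, 28, 11, 42, 6],
    "home_matches": [8, 6, 63, 43, 72, 20],
    "away_matches": [8, 10, 45, 32, 89, 26],
})

# Sum the raw counts per team, then recompute the percentages from the totals
combined = home_away_demo.groupby("team", as_index=False).sum()
combined["home_win_percentage"] = 100 * combined["home_wins"] / combined["home_matches"]
combined["away_win_percentage"] = 100 * combined["away_wins"] / combined["away_matches"]
print(combined.round(2))
```

Averaging the two percentage rows instead would weight an 8-match sample the same as a 20-match one, which is why the sketch works from the counts.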
In [186]:
matches=matches.replace({'Sunrisers Hyderabad':"SRH", 'Mumbai Indians': "MI", 'Gujarat Lions':"GL",
       'Rising Pune Supergiants': "Pune", 'Royal Challengers Bangalore': "RCB",
       'Kolkata Knight Riders': "KKR", 'Delhi Daredevils': "Delhi", 'Kings XI Punjab':"Punjab",
       'Chennai Super Kings':"CSK", 'Rajasthan Royals': "RR", 'Deccan Chargers':"SRH",
       'Kochi Tuskers Kerala':"KTK", 'Pune Warriors':"Pune", 'Delhi Capitals':"Delhi", 
        'Rising Pune Supergiant':"Pune"})

matches
Out[186]:
id Season city date team1 team2 toss_winner toss_decision result dl_applied winner win_by_runs win_by_wickets player_of_match venue umpire1 umpire2 umpire3
0 1 IPL-2017 Hyderabad 05-04-2017 SRH RCB RCB field normal 0 SRH 35 0 Yuvraj Singh Rajiv Gandhi International Stadium, Uppal AY Dandekar NJ Llong NaN
1 2 IPL-2017 Pune 06-04-2017 MI Pune Pune field normal 0 Pune 0 7 SPD Smith Maharashtra Cricket Association Stadium A Nand Kishore S Ravi NaN
2 3 IPL-2017 Rajkot 07-04-2017 GL KKR KKR field normal 0 KKR 0 10 CA Lynn Saurashtra Cricket Association Stadium Nitin Menon CK Nandan NaN
3 4 IPL-2017 Indore 08-04-2017 Pune Punjab Punjab field normal 0 Punjab 0 6 GJ Maxwell Holkar Cricket Stadium AK Chaudhary C Shamshuddin NaN
4 5 IPL-2017 Bangalore 08-04-2017 RCB Delhi RCB bat normal 0 RCB 15 0 KM Jadhav M Chinnaswamy Stadium NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
751 11347 IPL-2019 Mumbai 05-05-2019 KKR MI MI field normal 0 MI 0 9 HH Pandya Wankhede Stadium Nanda Kishore O Nandan S Ravi
752 11412 IPL-2019 Chennai 07-05-2019 CSK MI CSK bat normal 0 MI 0 6 AS Yadav M. A. Chidambaram Stadium Nigel Llong Nitin Menon Ian Gould
753 11413 IPL-2019 Visakhapatnam 08-05-2019 SRH Delhi Delhi field normal 0 Delhi 0 2 RR Pant ACA-VDCA Stadium NaN NaN NaN
754 11414 IPL-2019 Visakhapatnam 10-05-2019 Delhi CSK CSK field normal 0 CSK 0 6 F du Plessis ACA-VDCA Stadium Sundaram Ravi Bruce Oxenford Chettithody Shamshuddin
755 11415 IPL-2019 Hyderabad 12-05-2019 MI CSK MI bat normal 0 MI 1 0 JJ Bumrah Rajiv Gandhi Intl. Cricket Stadium Nitin Menon Ian Gould Nigel Llong

756 rows × 18 columns

In [187]:
most_runs
Out[187]:
batsman total_runs out numberofballs average strikerate
0 V Kohli 5426 152 4111 35.697368 131.987351
1 SK Raina 5386 160 3916 33.662500 137.538304
2 RG Sharma 4902 161 3742 30.447205 130.999466
3 DA Warner 4717 114 3292 41.377193 143.286756
4 S Dhawan 4601 137 3665 33.583942 125.538881
... ... ... ... ... ... ...
511 ND Doshi 0 1 13 0.000000 0.000000
512 J Denly 0 1 1 0.000000 0.000000
513 S Ladda 0 2 9 0.000000 0.000000
514 V Pratap Singh 0 1 1 0.000000 0.000000
515 S Kaushik 0 1 1 0.000000 0.000000

516 rows × 6 columns

In [188]:
deliveries
Out[188]:
match_id inning batting_team bowling_team over ball batsman non_striker bowler is_super_over ... bye_runs legbye_runs noball_runs penalty_runs batsman_runs extra_runs total_runs player_dismissed dismissal_kind fielder
0 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 1 DA Warner S Dhawan TS Mills 0 ... 0 0 0 0 0 0 0 NaN NaN NaN
1 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 2 DA Warner S Dhawan TS Mills 0 ... 0 0 0 0 0 0 0 NaN NaN NaN
2 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 3 DA Warner S Dhawan TS Mills 0 ... 0 0 0 0 4 0 4 NaN NaN NaN
3 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 4 DA Warner S Dhawan TS Mills 0 ... 0 0 0 0 0 0 0 NaN NaN NaN
4 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 5 DA Warner S Dhawan TS Mills 0 ... 0 0 0 0 0 2 2 NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
179073 11415 2 Chennai Super Kings Mumbai Indians 20 2 RA Jadeja SR Watson SL Malinga 0 ... 0 0 0 0 1 0 1 NaN NaN NaN
179074 11415 2 Chennai Super Kings Mumbai Indians 20 3 SR Watson RA Jadeja SL Malinga 0 ... 0 0 0 0 2 0 2 NaN NaN NaN
179075 11415 2 Chennai Super Kings Mumbai Indians 20 4 SR Watson RA Jadeja SL Malinga 0 ... 0 0 0 0 1 0 1 SR Watson run out KH Pandya
179076 11415 2 Chennai Super Kings Mumbai Indians 20 5 SN Thakur RA Jadeja SL Malinga 0 ... 0 0 0 0 2 0 2 NaN NaN NaN
179077 11415 2 Chennai Super Kings Mumbai Indians 20 6 SN Thakur RA Jadeja SL Malinga 0 ... 0 0 0 0 0 0 0 SN Thakur lbw NaN

179078 rows × 21 columns

In [189]:
players
Out[189]:
Player_Name DOB Batting_Hand Bowling_Skill Country
0 A Ashish Reddy 24-Feb-91 Right_Hand Right-arm medium India
1 A Chandila 05-Dec-83 Right_Hand Right-arm offbreak India
2 A Chopra 19-Sep-77 Right_Hand Right-arm offbreak India
3 A Choudhary NaN Right_hand Left-arm fast-medium NaN
4 A Dananjaya NaN Right_Hand Right-arm offbreak NaN
... ... ... ... ... ...
561 Younis Khan 29-Nov-77 Right_Hand Right-arm medium Pakistan
562 YS Chahal 23-Jul-90 Right_Hand Legbreak googly India
563 Yuvraj Singh 12-Dec-81 Left_Hand Slow left-arm orthodox India
564 YV Takawale 05-Nov-84 Right_Hand NaN India
565 Z Khan 07-Oct-78 Right_Hand Left-arm fast-medium India

566 rows × 5 columns

Number of Matches won by a team

In [190]:
# value_counts() returns the wins per team, sorted from most to fewest
most_wins = matches['winner'].value_counts()

plt.figure(figsize=(12,8))
sns.barplot(x = most_wins.index,y = most_wins.values)
plt.ylabel("Number of Matches Won",fontsize = 14)
plt.xticks(fontsize = 14)
plt.title("NUMBER OF MATCHES WON BY A TEAM",{"fontsize":16});

Observation

  • Mumbai Indians have won the most matches
In [191]:
# Count toss wins per team; explicit column names avoid relying on pandas' defaults
toss_winner = matches['toss_winner'].value_counts().rename_axis('team').reset_index(name='toss_wins')
sns.set(rc={'figure.figsize':(15,8)})
ax = plt.axes()
ax.set(facecolor = 'lightblue')
plt.title('Number of Tosses Won by Team',fontsize = 20)
sns.barplot(y = toss_winner['team'] ,x = toss_winner['toss_wins'],orient = 'h',palette = 'cubehelix')
plt.xlabel('Total Toss Wins')
plt.ylabel('Teams')
plt.show()

Observation

  • Mumbai Indians have also won the most tosses
In [192]:
plt.figure(figsize = (15,10))
ax = plt.axes()
ax.grid(False)
sns.countplot(data = matches,x = 'Season',hue = 'toss_decision',linewidth=5)
plt.xticks(rotation  = 90)
plt.title('Toss decisions by season',fontsize = 20)
plt.xlabel('Seasons')
plt.ylabel('Count')
plt.show()

Observation

  • In most seasons, the toss-winning team chose to field first
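
The count plot shows absolute numbers, which makes seasons of different length harder to compare. A minimal sketch of how the share of each decision per season could be tabulated with a row-normalised cross-tabulation. The toy frame `toss_demo` is a hypothetical stand-in for the `Season` and `toss_decision` columns of `matches`.

```python
import pandas as pd

# Toy stand-in for the `Season` and `toss_decision` columns of `matches`
toss_demo = pd.DataFrame({
    "Season": ["IPL-2017"] * 4 + ["IPL-2018"] * 4,
    "toss_decision": ["field", "field", "bat", "field",
                      "field", "field", "field", "bat"],
})

# Row-normalised cross-tabulation: share of each decision within a season
decision_share = pd.crosstab(
    toss_demo["Season"], toss_demo["toss_decision"], normalize="index"
)

# True for every season in which more than half of the toss winners chose to field
field_first_majority = decision_share["field"] > 0.5
print(decision_share)
print(int(field_first_majority.sum()), "of", len(field_first_majority), "seasons favour fielding")
```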
In [193]:
plt.figure(figsize = (12,10))
ax = plt.axes()

ax.grid(False)
ax = sns.countplot(data =matches, x = 'venue', order = matches['venue'].value_counts().index[0:10], palette = 'gnuplot',linewidth=5)

for p in ax.patches:
    ax.text(p.get_x() + p.get_width()/2., p.get_height(), '%d' % int(p.get_height()), 
            fontsize=15, color='black', ha='center', va='bottom')

plt.xlabel('Venues',fontsize = 15)
plt.ylabel('Total Matches',fontsize = 15)
plt.xticks(rotation = 90, fontsize = 12)
plt.yticks(fontsize = 12)
plt.title('Top 10 host venues',fontsize = 20)
plt.show()

Observation

  • Eden Gardens has hosted the most matches
In [194]:
# Top 10 players by number of player-of-the-match awards
most_mom_df = (matches["player_of_match"].value_counts()
               .rename_axis("player")
               .reset_index(name="num of time won"))
most_mom_df = most_mom_df[:10]
most_mom_df
Out[194]:
player num of time won
0 CH Gayle 21
1 AB de Villiers 20
2 RG Sharma 17
3 MS Dhoni 17
4 DA Warner 17
5 YK Pathan 16
6 SR Watson 15
7 SK Raina 14
8 G Gambhir 13
9 V Kohli 12
In [195]:
plt.figure(figsize=(12,6))
sns.barplot(x = most_mom_df.player, y = most_mom_df["num of time won"])
plt.ylabel("Number of times won",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 90)
plt.title("MOST MAN OF THE MATCH AWARDS",{"fontsize":16});

Observation

  • CH Gayle has won the most Man of the Match awards (21)
  • AB de Villiers is second with 20
  • RG Sharma, MS Dhoni and DA Warner share third place with 17 each
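
Positions with ties can be read off programmatically rather than by eye. The sketch below ranks the ten award counts from the table above so that tied players share a position. `awards` and `position` are hypothetical names.

```python
import pandas as pd

# Award counts copied from the Out[194] table
awards = pd.Series(
    [21, 20, 17, 17, 17, 16, 15, 14, 13, 12],
    index=["CH Gayle", "AB de Villiers", "RG Sharma", "MS Dhoni", "DA Warner",
           "YK Pathan", "SR Watson", "SK Raina", "G Gambhir", "V Kohli"],
    name="num of time won",
)

# method="min" gives tied players the same (lowest) position
position = awards.rank(method="min", ascending=False).astype(int)
print(position[position == 3].index.tolist())
```

With `method="min"`, the player after a three-way tie for third is placed sixth, not fourth.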
In [196]:
season_column = matches.Season.unique()   
season_column = np.sort(season_column)
In [197]:
highest_win_percentage_by_year = []
team_with_highest_win_percentage = []
In [198]:
for season in season_column:

  df_by_year = matches[matches["Season"] == season]
  # value_counts() sorts descending, so the first entry is the team with the most wins
  wins = df_by_year.winner.value_counts()
  team_most_wins = wins.index[0]
  number_of_wins = wins.iloc[0]
  # Matches in which that team appeared as either team1 or team2
  matches_played = len(df_by_year[(df_by_year.team1 == team_most_wins)|(df_by_year.team2 == team_most_wins)])
  percentage = np.round(number_of_wins*100/matches_played,2)
  highest_win_percentage_by_year.append(percentage)
  team_with_highest_win_percentage.append(team_most_wins)
In [199]:
highest_win_percentage_by_year
Out[199]:
[81.25,
 66.67,
 68.75,
 68.75,
 70.59,
 68.42,
 70.59,
 62.5,
 64.71,
 70.59,
 68.75,
 68.75]
In [200]:
team_with_highest_win_percentage
Out[200]:
['RR',
 'Delhi',
 'MI',
 'CSK',
 'KKR',
 'MI',
 'Punjab',
 'MI',
 'SRH',
 'MI',
 'CSK',
 'MI']
In [201]:
most_successful_team_by_year = pd.DataFrame({"Season" : season_column,"team" : team_with_highest_win_percentage,"win_percentage":highest_win_percentage_by_year})
In [202]:
most_successful_team_by_year
Out[202]:
Season team win_percentage
0 IPL-2008 RR 81.25
1 IPL-2009 Delhi 66.67
2 IPL-2010 MI 68.75
3 IPL-2011 CSK 68.75
4 IPL-2012 KKR 70.59
5 IPL-2013 MI 68.42
6 IPL-2014 Punjab 70.59
7 IPL-2015 MI 62.50
8 IPL-2016 SRH 64.71
9 IPL-2017 MI 70.59
10 IPL-2018 CSK 68.75
11 IPL-2019 MI 68.75
In [203]:
for index,data in enumerate(most_successful_team_by_year.win_percentage):
  print(index,data)
0 81.25
1 66.67
2 68.75
3 68.75
4 70.59
5 68.42
6 70.59
7 62.5
8 64.71
9 70.59
10 68.75
11 68.75
In [204]:
plt.figure(figsize=(18,6))
sns.barplot(data = most_successful_team_by_year,x = "Season" ,y = "win_percentage")
plt.xlabel("Year",fontsize = 16)
plt.ylabel("Win Percentage",fontsize = 16)
for index,data in enumerate(most_successful_team_by_year.win_percentage):
  plt.text(x=index - 0.25,y=data+1,s=f"{data}%",fontsize = 14)
for index,data in enumerate(most_successful_team_by_year.team):
  plt.text(x=index - 0.35,y=0.6,s=f"{data}",fontsize = 16)
plt.xticks(fontsize=12)
plt.title("MOST SUCCESSFUL TEAM BY YEAR",fontsize=18,pad=30);

Observation

  • RR's 81.25% in IPL-2008 is the highest win percentage among the season leaders
  • In IPL-2009 the title went to Deccan Chargers (SRH in this analysis), but Delhi won the most matches
  • In IPL-2014 the title went to KKR, but Punjab won the most matches
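
The loop above picks the team with the most wins in each season and then reports that team's win percentage. A team with fewer wins from fewer matches could have a higher percentage. A minimal sketch of how the percentage could be computed for every team before choosing the highest, on a toy frame. `demo`, `appearances`, `table` and `best` are hypothetical names.

```python
import pandas as pd

# Toy stand-in for `matches`: one season, three teams
demo = pd.DataFrame({
    "Season": ["IPL-X"] * 5,
    "team1":  ["A", "A", "B", "A", "C"],
    "team2":  ["B", "C", "C", "B", "B"],
    "winner": ["A", "A", "B", "B", "C"],
})

# One row per (match, team) so that every appearance is counted once
appearances = demo.melt(id_vars=["Season", "winner"],
                        value_vars=["team1", "team2"], value_name="team")
appearances["won"] = appearances["team"] == appearances["winner"]

# Matches played and won per (season, team)
table = (appearances.groupby(["Season", "team"])
         .agg(played=("won", "size"), won=("won", "sum"))
         .reset_index())
table["win_pct"] = (100 * table["won"] / table["played"]).round(2)

# Row with the highest win percentage in each season
best = table.loc[table.groupby("Season")["win_pct"].idxmax()]
print(table)
print(best)
```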
In [205]:
final = matches.groupby('Season').tail(1)
final['winner'].value_counts()

plt.figure(figsize=(10,8))
ax = plt.axes()
ax.set(facecolor = 'grey')
ax.grid(False)


sns.countplot(x=final['winner'],order = final['winner'].value_counts().index, linewidth = 5, palette = 'gist_ncar')
plt.title("IPL Champions",fontsize=20)
plt.xlabel('Teams',fontsize=20)
plt.ylabel('No.of Trophy',fontsize=20)
plt.xticks(rotation=0)
plt.show()

Observation

  • MI have won the title 4 times
  • CSK have won it 3 times
  • KKR and SRH have won it twice each (the SRH count includes Deccan Chargers)
  • RR have won it once
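
`groupby('Season').tail(1)` treats the last row of each season as the final, which holds only if the file is ordered by date within a season. A minimal sketch of selecting the latest match per season by date instead. It assumes day-month-year date strings, as in the preview of `matches`. The toy frame and the names `demo`, `finals` and `titles` are hypothetical.

```python
import pandas as pd

# Toy stand-in for `matches`; rows are deliberately not in date order
demo = pd.DataFrame({
    "Season": ["IPL-2018", "IPL-2019", "IPL-2019", "IPL-2018"],
    "date":   ["27-05-2018", "12-05-2019", "10-05-2019", "25-05-2018"],
    "winner": ["CSK", "MI", "CSK", "SRH"],
})

# Assumption: dates are day-month-year strings
demo["date"] = pd.to_datetime(demo["date"], format="%d-%m-%Y")

# Latest match of each season, independent of the row order in the file
finals = demo.loc[demo.groupby("Season")["date"].idxmax()]
titles = finals["winner"].value_counts()
print(finals[["Season", "winner"]])
```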
In [206]:
toss_factor = matches.toss_winner == matches.winner
In [207]:
toss_factor.value_counts()
Out[207]:
True     393
False    363
dtype: int64
In [208]:
toss_data = { "Matches_won_by_toss_winner" : int(toss_factor.sum()),
              "Matches_won_by_toss_loser" : int((~toss_factor).sum())}
toss_data = pd.Series(toss_data)
In [209]:
toss_data.index
Out[209]:
Index(['Matches_won_by_toss_winner', 'Matches_won_by_toss_loser'], dtype='object')
In [210]:
plt.figure(figsize=(7,7))
plt.pie(x=toss_data,autopct="%.2f%%",explode=[0.03]*2,labels=toss_data.index);

Observation

  • The toss winner won 51.98% of matches
  • The toss loser won 48.02% of matches
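
Whether 51.98% is meaningfully different from an even split can be checked with a quick calculation. The sketch below uses a normal approximation to a two-sided binomial test on the counts shown above. The choice of test is an assumption made for illustration, not part of the original analysis.

```python
import math

wins_by_toss_winner = 393   # from Out[207]
total_matches = 756         # 393 + 363

share = wins_by_toss_winner / total_matches

# Normal approximation to a binomial test of H0: P(toss winner wins) = 0.5
expected = total_matches * 0.5
std_dev = math.sqrt(total_matches * 0.5 * 0.5)
z = (wins_by_toss_winner - expected) / std_dev
p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided

print(f"share = {share:.2%}, z = {z:.2f}, p = {p_value:.3f}")
```

The toss winner is 15 wins above the 378 expected under an even split, roughly 1.1 standard deviations, so on this approximation the advantage is small relative to chance variation.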
In [211]:
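# Note: in the preview of `matches`, `result` holds values such as 'normal',
# so this filter may keep every match; `win_by_runs > 0` marks wins by the side batting first.
bat_first = matches.venue[matches['result'] != 'wickets'].value_counts()[0:5].rename_axis('venue').reset_index(name='wins')
bat_first

plt.figure(figsize = (10,8))
ax = plt.axes()
ax.grid(False)
ax.set(facecolor = 'grey')
ax = sns.barplot(data =bat_first, x = 'wins', y = 'venue',linewidth=5)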

plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.title('Top 5 venues to win games batting first')
plt.xlabel('Total Wins')
plt.ylabel('Venues')
plt.show()

Observation

  • Eden Gardens tops this chart; the bars are match counts, not win probabilities
In [212]:
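# Note: as above, this filter on `result` may keep every match;
# `win_by_wickets > 0` marks wins by the side batting second.
bat_second = matches.venue[matches['result'] != 'runs'].value_counts()[0:5].rename_axis('venue').reset_index(name='wins')
plt.figure(figsize = (10,8))
ax = plt.axes()
ax.grid(False)
ax.set(facecolor = 'lightblue')
ax = sns.barplot(data =bat_second, x = 'wins', y = 'venue', palette = 'seismic',linewidth=5)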
plt.title('Top 5 venues to win games batting second')
plt.xlabel('Total Wins')
plt.ylabel('Venues')
plt.show()

Observation

  • Eden Gardens also tops this chart; again, the bars are match counts, not win probabilities
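
In the preview of `matches`, the margin of victory sits in `win_by_runs` and `win_by_wickets`: a positive `win_by_runs` means the side batting first won, and a positive `win_by_wickets` means the chasing side won. A minimal sketch of counting both per venue and turning the counts into a share of matches, on a toy frame. `demo` and `by_venue` are hypothetical names.

```python
import pandas as pd

# Toy stand-in for the `venue`, `win_by_runs` and `win_by_wickets` columns
demo = pd.DataFrame({
    "venue": ["Ground A", "Ground A", "Ground A", "Ground B", "Ground B"],
    "win_by_runs":    [35, 0, 0, 15, 0],
    "win_by_wickets": [0, 7, 6, 0, 0],   # last row: no winner recorded
})

demo["bat_first_win"] = demo["win_by_runs"] > 0
demo["bat_second_win"] = demo["win_by_wickets"] > 0

by_venue = demo.groupby("venue").agg(
    matches=("bat_first_win", "size"),
    bat_first_wins=("bat_first_win", "sum"),
    bat_second_wins=("bat_second_win", "sum"),
)
# Share of matches at the venue won by the side batting first
by_venue["bat_first_share"] = by_venue["bat_first_wins"] / by_venue["matches"]
print(by_venue)
```

Dividing by the number of matches separates venues that favour the side batting first from venues that simply host more games.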
In [213]:
most_runs
Out[213]:
batsman total_runs out numberofballs average strikerate
0 V Kohli 5426 152 4111 35.697368 131.987351
1 SK Raina 5386 160 3916 33.662500 137.538304
2 RG Sharma 4902 161 3742 30.447205 130.999466
3 DA Warner 4717 114 3292 41.377193 143.286756
4 S Dhawan 4601 137 3665 33.583942 125.538881
... ... ... ... ... ... ...
511 ND Doshi 0 1 13 0.000000 0.000000
512 J Denly 0 1 1 0.000000 0.000000
513 S Ladda 0 2 9 0.000000 0.000000
514 V Pratap Singh 0 1 1 0.000000 0.000000
515 S Kaushik 0 1 1 0.000000 0.000000

516 rows × 6 columns

In [214]:
most_runs_df = most_runs[:10]
In [215]:
plt.figure(figsize=(12,6))
sns.barplot(x = most_runs_df.batsman, y = most_runs_df["strikerate"])
plt.ylabel("Strike rate",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 30)
plt.title("STRIKE RATE",{"fontsize":16});

Observation

  • Among the top 10 run-scorers, CH Gayle and AB de Villiers have the highest strike rates
In [216]:
plt.figure(figsize=(12,6))
sns.barplot(x = most_runs_df.batsman, y = most_runs_df["total_runs"])
plt.ylabel("Total runs",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 30)
plt.title("TOP 10 BATSMEN SCORE",{"fontsize":16});

Observation

  • V Kohli has scored the most runs in the dataset (5,426)
In [217]:
# Top 10 batsmen by number of fours, with explicit column names
fours = deliveries.batsman[deliveries['batsman_runs'] == 4 ].value_counts()[0:10].rename_axis('batsman').reset_index(name='fours')

plt.figure(figsize = (15,8))
ax = plt.axes()
ax.grid(False)
ax.set(facecolor = 'grey')
ax = sns.barplot(data = fours, x = 'batsman', y = 'fours', palette = 'Oranges',linewidth=5)

plt.ylabel('Total no. of fours')
plt.xlabel('Batsman')
plt.xticks(fontsize=12,rotation = 0)
plt.yticks(fontsize=12)
plt.title('Top 10 batsmen with most no. of fours')


plt.show()

Observation

  • S Dhawan has hit the most fours
In [218]:
# Top 10 batsmen by number of sixes, with explicit column names
six = deliveries.batsman[deliveries['batsman_runs'] == 6 ].value_counts()[0:10].rename_axis('batsman').reset_index(name='sixes')

plt.figure(figsize = (15,8))
ax = plt.axes()
ax.grid(False)
ax.set(facecolor = 'grey')
ax = sns.barplot(data = six, x = 'batsman', y = 'sixes', palette = 'Oranges',linewidth=5)

plt.ylabel('Total no. of sixes')
plt.xlabel('Batsman')
plt.xticks(fontsize=12,rotation = 90)
plt.yticks(fontsize=12)
plt.title('Top 10 batsmen with most no. of sixes')


plt.show()

Observation

  • CH Gayle has hit the most sixes
In [219]:
plt.figure(figsize=(12,6))
sns.barplot(x = home_away.team, y = home_away["home_win_percentage"])
plt.ylabel("Percentage ",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 0)
plt.title("HOME WIN PERCENTAGE",{"fontsize":16});

Observation

  • Pune have the highest home win percentage in the table: 62.5% (5 wins from 8 home matches, in one of the two rows labelled Pune)
In [220]:
plt.figure(figsize=(12,6))
sns.barplot(x = home_away.team, y = home_away["away_win_percentage"])
plt.ylabel("Percentage ",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 0)
plt.title("AWAY WIN PERCENTAGE",{"fontsize":16});

Observation

  • GL have the highest away win percentage: 75% (12 wins from 16 away matches)
In [221]:
plt.figure(figsize=(20,10))
plt.bar(home_away['team'],home_away['home_matches'],label="MATCHES",color='r',width=.5)
plt.bar(home_away['team'],home_away['home_wins'],label="WON", color='b',width=.5)
plt.legend()
plt.ylabel("Count ",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 0)
plt.title("Home Matches and wins",{"fontsize":16});
plt.show()

Observation

  • MI have played the most home matches (101) and won the most (58)
In [222]:
# Merge near-duplicate labels; right-arm bowlers stay right-arm
players=players.replace({'Right-arm Medium':"Right-arm medium",'Left-arm fast-medium':"Left-arm medium-fast",
                         'Right-arm fast-medium':"Right-arm medium-fast","Right_hand":'Right_Hand'})
In [223]:
plt.figure(figsize=(12,6))
sns.countplot(x="Bowling_Skill", data=players)
plt.ylabel("No. of Player ",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 90)
plt.title("BOWLING SKILL",{"fontsize":16});

Observation

  • Right-arm medium is the most common bowling skill
In [224]:
plt.figure(figsize=(12,6))
sns.countplot(x="Batting_Hand", data=players)
plt.ylabel("No. of Player ",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 0)
plt.title("BATTING HAND",{"fontsize":16});

Observation

  • Most batsmen are right-handed
In [225]:
players['Country'].value_counts().plot.bar(width=0.9,color='blue',alpha=0.75)
plt.xlabel('Countries',fontsize = 16)
plt.title("Countries vs Number of Players",{"fontsize":16});
plt.ylabel("Number of Players ",fontsize = 16)
plt.xticks(fontsize = 16,rotation = 30)
plt.show()

Observation

  • Most players are from India
In [226]:
df_match_deliver = matches[['id','Season']].merge(deliveries, left_on = 'id', right_on = 'match_id', how = 'left').drop('id', axis = 1)
In [227]:
df_match_deliver
Out[227]:
Season match_id inning batting_team bowling_team over ball batsman non_striker bowler ... bye_runs legbye_runs noball_runs penalty_runs batsman_runs extra_runs total_runs player_dismissed dismissal_kind fielder
0 IPL-2017 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 1 DA Warner S Dhawan TS Mills ... 0 0 0 0 0 0 0 NaN NaN NaN
1 IPL-2017 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 2 DA Warner S Dhawan TS Mills ... 0 0 0 0 0 0 0 NaN NaN NaN
2 IPL-2017 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 3 DA Warner S Dhawan TS Mills ... 0 0 0 0 4 0 4 NaN NaN NaN
3 IPL-2017 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 4 DA Warner S Dhawan TS Mills ... 0 0 0 0 0 0 0 NaN NaN NaN
4 IPL-2017 1 1 Sunrisers Hyderabad Royal Challengers Bangalore 1 5 DA Warner S Dhawan TS Mills ... 0 0 0 0 0 2 2 NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
179073 IPL-2019 11415 2 Chennai Super Kings Mumbai Indians 20 2 RA Jadeja SR Watson SL Malinga ... 0 0 0 0 1 0 1 NaN NaN NaN
179074 IPL-2019 11415 2 Chennai Super Kings Mumbai Indians 20 3 SR Watson RA Jadeja SL Malinga ... 0 0 0 0 2 0 2 NaN NaN NaN
179075 IPL-2019 11415 2 Chennai Super Kings Mumbai Indians 20 4 SR Watson RA Jadeja SL Malinga ... 0 0 0 0 1 0 1 SR Watson run out KH Pandya
179076 IPL-2019 11415 2 Chennai Super Kings Mumbai Indians 20 5 SN Thakur RA Jadeja SL Malinga ... 0 0 0 0 2 0 2 NaN NaN NaN
179077 IPL-2019 11415 2 Chennai Super Kings Mumbai Indians 20 6 SN Thakur RA Jadeja SL Malinga ... 0 0 0 0 0 0 0 SN Thakur lbw NaN

179078 rows × 22 columns

In [234]:
df = df_match_deliver[df_match_deliver['batsman_runs'] == 6]
sixes_by_season = df.groupby('Season')['batsman_runs'].count().reset_index()


plt.figure(figsize = (12,10))
ax = plt.axes()
ax.grid(False)

ax = sns.barplot(data =sixes_by_season, x = 'Season', y= 'batsman_runs',linewidth=3,)

for index, row in sixes_by_season.iterrows():
    ax.text(row.name,row.batsman_runs, row.batsman_runs, color='purple', ha="center",size = 20)

plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.title('Total sixes by season')
plt.xlabel('Season')
plt.ylabel('Total Sixes')

plt.show()

Observation

  • The most sixes were hit in IPL-2018 (869)
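
Seasons differ in the number of matches played, so a raw total can favour longer seasons. A minimal sketch of normalising the count to sixes per match, on a toy frame shaped like `df_match_deliver`. `demo` and `per_season` are hypothetical names.

```python
import pandas as pd

# Toy stand-in for `df_match_deliver`: one row per delivery
demo = pd.DataFrame({
    "Season":       ["IPL-A"] * 4 + ["IPL-B"] * 4,
    "match_id":     [1, 1, 2, 2, 3, 3, 3, 3],
    "batsman_runs": [6, 1, 6, 6, 6, 0, 4, 6],
})

per_season = (
    demo.assign(is_six=demo["batsman_runs"] == 6)
    .groupby("Season")
    .agg(sixes=("is_six", "sum"), matches=("match_id", "nunique"))
)
per_season["sixes_per_match"] = per_season["sixes"] / per_season["matches"]
print(per_season)
```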
In [265]:
# Count deliveries with a recorded dismissal per (season, bowler).
# Every dismissal kind in the data is counted here, including run outs.
df=df_match_deliver
kinds = deliveries['dismissal_kind'].unique()
data = df[df['dismissal_kind'].apply(lambda x: True if x in kinds and x != ' ' else False)].groupby(['Season','bowler']).count()['ball']
# Keep the 30 largest totals, then sort by the (Season, bowler) index
data=data.sort_values(ascending=False)[:30].sort_index(level=0)
# Keep the first row that appears for each season
d={}
i=0
val=0
for (season,bowler),wicket in data.items():
    if season != val:
        d[i]= [season,bowler,wicket]
        i+=1
        val = season
wicket=pd.DataFrame.from_dict(d, orient='index',columns=['Year', 'Player', 'Wicket'])
wicket
Out[265]:
Year Player Wicket
0 IPL-2008 Sohail Tanvir 24
1 IPL-2009 RP Singh 26
2 IPL-2011 SL Malinga 30
3 IPL-2012 M Morkel 30
4 IPL-2013 A Mishra 24
5 IPL-2014 MM Sharma 26
6 IPL-2015 A Nehra 25
7 IPL-2016 B Kumar 24
8 IPL-2017 B Kumar 28
9 IPL-2018 AJ Tye 28
10 IPL-2019 Imran Tahir 26
In [267]:
plt.figure(figsize=(12,6))
sns.barplot(x = wicket.Player, y = wicket["Wicket"])
plt.ylabel("Wicket ",fontsize = 16)
plt.xticks(fontsize = 13,rotation = 0)
plt.title("Purple Cap Winner",{"fontsize":16});

Observation

  • In this table, SL Malinga (IPL-2011) and M Morkel (IPL-2012) have the highest totals, with 30 wickets each. IPL-2010 has no row because only the 30 largest season totals are kept.
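
The table above counts every recorded dismissal and takes one row per season from the 30 largest totals. A minimal sketch of a stricter count that skips dismissals not credited to the bowler and keeps the highest total in every season. The set `NOT_BOWLER_WICKETS` is an assumption about which `dismissal_kind` labels to exclude, and the toy frame and other names are hypothetical.

```python
import pandas as pd

# Toy stand-in for the deliveries of `df_match_deliver` that ended in a dismissal
demo = pd.DataFrame({
    "Season": ["IPL-A"] * 5 + ["IPL-B"] * 3,
    "bowler": ["X", "X", "Y", "Y", "Y", "X", "Z", "Z"],
    "dismissal_kind": ["caught", "run out", "bowled", "lbw", "run out",
                       "caught", "stumped", "bowled"],
})

# Assumption: these kinds are not credited to the bowler
NOT_BOWLER_WICKETS = {"run out", "retired hurt", "obstructing the field"}

credited = demo[demo["dismissal_kind"].notna()
                & ~demo["dismissal_kind"].isin(NOT_BOWLER_WICKETS)]
wickets = (credited.groupby(["Season", "bowler"]).size()
           .rename("wickets").reset_index())

# Highest total in each season, with no cut-off on the number of rows kept
top_per_season = (wickets.sort_values("wickets", ascending=False)
                  .drop_duplicates("Season")
                  .sort_values("Season")
                  .reset_index(drop=True))
print(top_per_season)
```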
In [230]:
data = df_match_deliver.groupby(['Season','batsman'])['batsman_runs'].sum().reset_index()
data.sort_values('batsman_runs',ascending=False,inplace = True)
data.drop_duplicates(subset=["Season"],keep="first",inplace = True)

fig = px.bar(data, x='batsman', y='batsman_runs',text ='Season',color = 'batsman')

fig.update_layout(
    height=500,
    title_text='Orange Cap Winners',
    xaxis =dict(title = 'Batsman'),
    yaxis = dict(title = 'Runs'),
)
fig.show()

Observation

  • DA Warner is the most frequent Orange Cap winner, leading the run charts in 2015, 2017 and 2019
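
The number of seasons each batsman led can also be counted directly instead of being read off the chart. A minimal sketch on a toy frame shaped like `df_match_deliver`. `demo`, `season_runs`, `leaders` and `caps` are hypothetical names.

```python
import pandas as pd

# Toy stand-in for `df_match_deliver`: one row per delivery
demo = pd.DataFrame({
    "Season":  ["IPL-A", "IPL-A", "IPL-A", "IPL-B", "IPL-B", "IPL-B"],
    "batsman": ["P", "Q", "P", "Q", "R", "R"],
    "batsman_runs": [4, 6, 1, 2, 6, 4],
})

# Total runs per (season, batsman)
season_runs = demo.groupby(["Season", "batsman"], as_index=False)["batsman_runs"].sum()

# Row with the highest total in each season
leaders = season_runs.loc[season_runs.groupby("Season")["batsman_runs"].idxmax()]

# Number of seasons in which each batsman led the run charts
caps = leaders["batsman"].value_counts()
print(leaders)
print(caps)
```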

Conclusion

  • Mumbai Indians have won the most matches
  • Mumbai Indians have also won the most tosses
  • In most seasons, the toss-winning team chose to field first
  • Eden Gardens has hosted the most matches
  • CH Gayle has won the most Man of the Match awards (21)
  • Title wins:
    • MI have won the title 4 times
    • CSK have won it 3 times
    • KKR and SRH have won it twice each (the SRH count includes Deccan Chargers)
    • RR have won it once
  • Toss winner
    • The toss winner won 51.98% of matches
    • The toss loser won 48.02% of matches
  • Eden Gardens tops the batting-first venue chart
  • Eden Gardens also tops the batting-second venue chart
  • Among the top 10 run-scorers, CH Gayle and AB de Villiers have the highest strike rates
  • V Kohli has scored the most runs in the dataset (5,426)
  • S Dhawan has hit the most fours
  • CH Gayle has hit the most sixes
  • Pune have the highest home win percentage in the table (62.5%)
  • GL have the highest away win percentage (75%)
  • MI have played the most home matches (101) and won the most (58)
  • Right-arm medium is the most common bowling skill
  • Most batsmen are right-handed
  • Most players are from India
  • SL Malinga (IPL-2011) and M Morkel (IPL-2012) have the highest totals in the Purple Cap table, with 30 wickets each
  • DA Warner is the most frequent Orange Cap winner, leading the run charts in 2015, 2017 and 2019
  • RR's 81.25% in IPL-2008 is the highest win percentage among the season leaders
